- The MAILING DATE of this communication appears on the cover sheet with the correspondence address -
Period for Reply 

A SHORTENED STATUTORY PERIOD FOR REPLY IS SET TO EXPIRE 3 MONTH(S) FROM 
THE MAILING DATE OF THIS COMMUNICATION. 

- Extensions of time may be available under the provisions of 37 CFR 1.136(a). In no event, however, may a reply be timely filed

4) [X] Claim(s) 1-21 is/are pending in the application.
5) [ ] Claim(s) is/are allowed.
6) [X] Claim(s) 1-21 is/are rejected.

DETAILED ACTION



1. Claims 1-21 have been examined. 



Papers Submitted 

2. It is hereby acknowledged that the following papers have been received and 
placed of record in the file: 

#3: Declaration (9/24/01) 

#4: Extension of time (9/24/01) 



Drawings 



3. The drawings are objected to under 37 CFR 1.83(a). The drawings must show
every feature of the invention specified in the claims. Therefore, the counters of claim 1 
and compiler of claim 12 must be shown or the feature(s) canceled from the claim(s). 
No new matter should be entered. 

A proposed drawing correction or corrected drawings are required in reply to the 
Office action to avoid abandonment of the application. The objection to the drawings 
will not be held in abeyance. 

Specification



4. The abstract of the disclosure is objected to because it discloses that an 
instruction of the first instruction set is executed in response to a branch instruction of 
the first instruction set, and that control signals of the second instruction set are 
executed in response to a branch instruction of the second instruction set. However, 
the Office, upon careful inspection of the detailed description of the invention, believes 
this disclosure to be inaccurate. The specification states that an instruction of the first
instruction set is executed in response to a branch instruction of the second instruction 
set, and that control signals of the second instruction set are executed in response to a 
branch instruction of the first instruction set (pg. 11, lines 13-17; pg. 12-13, lines 22 and
1-2). Correction is required. See MPEP § 608.01(b). 

5. The disclosure is objected to because of the following informalities: 

1) The Summary of the invention on pg. 4, lines 4-7, discloses that an instruction 
of the first instruction set is executed in response to a branch instruction of the first 
instruction set, and that control signals of the second instruction set are executed in 
response to a branch instruction of the second instruction set. However, the Office, 
upon careful inspection of the detailed description of the invention, believes this disclosure
to be inaccurate. The specification states that an instruction of the first instruction set is 
executed in response to a branch instruction of the second instruction set, and that 
control signals of the second instruction set are executed in response to a branch 
instruction of the first instruction set (pg. 11, lines 13-17; pg. 12-13, lines 22 and 1-2). 

2) The Office is confused regarding the execution of an instruction of the second
instruction set. On pg. 11, lines 13-17, it is disclosed that an "unconditional switch
branch instruction of the primary instruction form", when detected, causes the alternate
form of instructions to be fetched. There is no description of the "unconditional switch
branch instruction of the primary instruction form". This leads to confusion regarding the
origin of the instruction, i.e. whether it is a special instruction inserted by the compiler, etc.
Please clarify.

3) There is a grammatical error on pg. 12, line 8.

4) The title of the invention is not descriptive. A new title is required that is 
clearly indicative of the invention to which the claims are directed. 

The following title is suggested: SYSTEM AND METHOD INCLUDING 
DISTRIBUTED INSTRUCTION BUFFERS HOLDING PRE-DECODED 
INSTRUCTIONS OF THE SECOND INSTRUCTION FORM IDENTIFIED BY THE 
COMPILER AS FREQUENTLY EXECUTED INSTRUCTIONS. 



Appropriate correction is required. 

Claim Objections



6. Claims 12 and 21 are objected to because of the following informalities: These
claims disclose that the processor comprises a compiler. The compiler is software that
compiles the processor's instructions. The compiler may be stored in the memory of the
processor or elsewhere. However, no such memory is disclosed in the claims
concerned. Appropriate correction is required.

7. Claims 15 and 18 are objected to because of the following informalities: Both claims
refer to the limitation of "a buffer". This leads to confusion due to the limitation of "a
plurality of buffers" in claim 11. Please make appropriate corrections.



Claim Rejections - 35 USC § 103 



8. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all
obviousness rejections set forth in this Office action:

(a) A patent may not be obtained though the invention is not identically disclosed 
or described as set forth in section 102 of this title, if the differences between the 
subject matter sought to be patented and the prior art are such that the subject 
matter as a whole would have been obvious at the time the invention was made 
to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was 
made. 

9. Claims 1, 5-9 are rejected under 35 U.S.C. 103(a) as being unpatentable over
Soni (US006223254B1) in view of Chan (US005317745A). 

10. In regard to claim 1:

11. Soni teaches a method for processing a first instruction form (fig. 3, instructions
stored in instruction cache 40; col. 8, lines 28-29) and a second instruction form (fig. 3, 
decoded instructions stored in parcel cache 52; col. 9, lines 20-23) of an instruction set 
in a processor comprising the steps of: 

storing a plurality of instructions of the second form (decoded instructions [col. 9, 
lines 20-23] are stored in a parcel cache 52) proximate to a plurality of execution units 
(fig. 3 shows the parcel cache proximate to the execution units); 

executing at least one instruction of the first instruction form in response to a first
counter (Although not explicitly mentioned, it is deemed inherent to have a program
counter for fetching instructions from memory which are to be subsequently executed;
otherwise the processor would not know from which location to fetch an
instruction); and

executing at least one instruction of the second instruction form (col. 9, lines 30- 
33, 11-13). 

12. Soni differs from the instant invention in that, while it does store a plurality of
instructions of the second form in a buffer (parcel cache) proximate to the execution
units, it does not store them in a plurality of buffers. Furthermore, although it must
inherently execute at least one instruction of the second form (decoded instruction) in
response to a certain program counter invoked by a branch instruction of the first form
(fig. 4 shows that the decoded instructions of the REP LODS AL instruction are stored in
the parcel cache. These decoded instructions in the parcel cache 52 are executed when
the branch instruction (JNZ.CC.XXXX) of the first instruction form in the instruction
cache 40 targets the REP LODS AL instruction [col. 12, lines 55-58]), it does not
specifically mention that the second instruction form is executed in response to at least
one second counter, wherein the second counter is invoked by a branch instruction
of the first form.

13. "Official Notice" is taken that it is well known and expected in the art that a buffer
split into a plurality of smaller buffers has the benefit of less complex indexing circuitry
leading to faster lookups.

14. Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to have modified the parcel cache by splitting it into a plurality of buffers. 

15. One would have been motivated to do so to benefit from faster lookup times and 
hence faster processing. 

16. However this combination still differs from the invention because it does not 
teach the second counter. 

17. Chan teaches that by using a general program counter for the main program and 
an alternate program counter for a subroutine, latency in stack processing and therefore 
switching between program counters can be cut drastically (col. 2, lines 42-49; col. 8, 
lines 32-34). 

18. Therefore it would have been obvious to one of ordinary skill in the art at the time
of the invention to have used a second (program) counter for the instructions of the 
second form which are stored in the parcel cache. 

19. One would have been motivated to do so because it would allow for lower
latency in switching between program counters and therefore higher processing speed.



20. In regard to claim 5: 

21. Soni discloses the method of claim 1, wherein the step of executing at least one
instruction of the second instruction form further comprises the steps of: 

fetching at least one instruction in the buffers (parcel (or decoded instruction) is 
fetched from parcel cache and sent to the reservation station [col. 9, lines 29-34]); and 

sequencing a plurality of control signals to the execution units (Although not 
mentioned explicitly, col. 9, lines 11-13 disclose that the instructions are sent from the
reservation station to the execution units and this would involve sequencing a plurality 
of control signals for proper scheduling of the instructions). 

22. In regard to claim 6: 

23. Soni discloses the method of claim 1, wherein the second instruction form is a
logical subset of the first instruction form (col. 9, lines 20-23 and col. 10, lines 1-2
indicate that the second instruction form is a decoded version of the first instruction form 
hence making it a logical subset of the first instruction form). 

24. In regard to claim 7:

25. Soni discloses the method of claim 1, wherein the step of executing at least one
instruction of the first instruction form further comprises the steps of: 

fetching an instruction of the first form from a memory (col. 5, lines 40-49); 
decoding the instruction (col. 5, lines 55-56); and 

issuing the decoded instruction to at least one execution unit (col. 5, lines 55-60). 

26. In regard to claim 8: 

27. Soni differs from the limitations of claim 8, namely it does not possess a switch bit
to signal return to fetching of the first instruction form, but discloses that a return to
fetching of the first instruction form is signaled by a return instruction of the second
instruction form stored in a buffer of a branch unit (Although not explicitly mentioned, it
is deemed inherent to the processor that a return instruction (a type of branch
instruction), executed and hence stored in a buffer of a branch unit, would address an
instruction to be fetched, i.e. signal fetching of the first instruction form, because a return
instruction commonly occurs after the execution of a loop (frequently executed
instructions of the second form) which would instruct the processor to fetch from a
section of code that is not frequently executed, i.e. first instruction form instructions. This
would result in a "hit" in the memory and not in the parcel cache [col. 9, lines 31-33]).



Application/Control Number: 09/845,693 Page 10 

Art Unit: 2183 

28. "Official Notice" is taken that it is well known and expected in the art to have a 
status bit indicating the state of a signal (e.g. a Zero bit of a status register) to simplify 
processing of the signal. 

29. It would have been obvious to one of ordinary skill in the art at the time of the
invention to add a switch bit indicating the control signal of the return instruction. 

30. One would have been motivated to do so in order to simplify processing and as it 
is common practice in the art. 



31. In regard to claim 9: 

32. Soni discloses the method of claim 1, wherein a return to fetching of the first
instruction form is signaled (col. 12, lines 45-46, the JUMP instruction indicates to fetch 
the next instruction i.e. the instruction of the first form after the REP LODS AL in the 
instruction cache) by a return instruction of the second instruction form stored in a buffer 
(fig. 4 shows the return instruction (JUMP) of the second instruction form in the parcel 
cache) of a branch unit. 

33. Claims 2-3 are rejected under 35 U.S.C. 103(a) as being unpatentable over Soni 
(US006223254B1) in view of Chan (US005317745A) as applied to claims 1, 5-9 above, 
and further in view of Ball and Larus ("Efficient Path Profiling," 29th Annual IEEE/ACM
International Symposium on Microarchitecture, Paris, pp. 46-57, 1996). 

34. In regard to claim 2:

35. Soni differs from the instant invention because the instructions of the first form
and instructions of the second form are generated by the processor based on
execution frequency (col. 2, lines 43-47; col. 7, lines 22-25; LRU (Least Recently Used)
algorithm results in the parcel cache holding instructions that are executed more
frequently) and not by the compiler.

36. Ball et al. teach that path profiling can be used to identify heavily executed paths 
in a program (col. 2, lines 22-23). They also teach that their efficient path profiling 
technique can be used for program optimization and performance tuning (col. 3, lines 
20-21). 

37. Therefore it would have been obvious to one of ordinary skill in the art to remove 
the processor hardware that executes the LRU algorithm to store decoded instructions 
in the parcel cache and use a compiler performing efficient path profiling instead to fill 
the parcel cache. 

38. One would have been motivated to do so because by using the compiler to perform the
function of the hardware, hardware is reduced and this translates to savings in die area 
and cost. 

39. In regard to claim 3:

40. Soni discloses that the instructions of the second form are more frequently executed
than the instructions of the first form (col. 2, lines 43-47; col. 7, lines 22-25; LRU (Least
Recently Used) algorithm results in the parcel cache holding instructions that are
executed more frequently).

41. Claims 4 and 10 are rejected under 35 U.S.C. 103(a) as being unpatentable over 
Soni (US006223254B1) in view of Chan (US005317745A) as applied to claims 1, 5-9
above, and further in view of Johnson ("Superscalar Microprocessor Design," Prentice 
Hall, 1991). 

42. In regard to claim 4: 

43. Soni differs from the instant invention in that he does not disclose the limitations
of claim 4, namely a plurality of execution queues storing the first instruction form,
de-gating the plurality of execution queues and pausing fetching from memory when
executing at least one instruction of the second form.

44. Johnson teaches that distributed reservation stations corresponding to separate
functional units, as compared to a centralized reservation station/window design, have
the benefit of less complex circuitry because each station selects among a smaller number
of instructions to issue, needs no arbitration circuitry as only one instruction is
issued from a distributed reservation station, and does not need to be able to hold all
instruction types, only the type specific to its functional unit (pg. 134, lines 1-22).

45. "Official Notice" is taken that it is well known and expected that a smaller
reservation station reduces complexity and minimizes lookup time.

46. Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to modify the centralized execution queue (reservation station, fig. 3) by 
separating it into a plurality of queues. 

47. It would also have been obvious to one of ordinary skill in the art at the time of
the invention to modify the processor by providing the instructions of the second form
(decoded instructions) in the parcel cache directly to the execution units (fig. 3,
60,61,71,80,90) as taught by Soni (col. 10, lines 29-31; "other pipeline stage") instead of
the reservation station because this would lead to a smaller reservation station as it
would not require as many entries. Inherently this would require de-gating the execution
queues (reservation stations) and stopping fetching from a memory to prevent the
instructions of the first type from executing.

48. One would have been motivated to make these modifications because it would 
lead to less complex circuitry and hence faster processing as taught by Johnson. 

49. In regard to claim 10: 

50. Soni (US006223254B1) in view of Chan (US005317745A) as applied to claim 1 
teaches a plurality of buffers but does not teach that each execution unit is associated 
with one buffer. 

51. However, it would have been obvious to one of ordinary skill in the art at the time
of the invention to design the processor such that each execution unit is associated with
one buffer by distributing the parcel cache into as many buffers as execution units and having
one for each.

52. One would have been motivated to do so because, from the teachings of
Johnson, circuit complexity is reduced if there is a different buffer for every execution
unit, as there is no need for arbitration circuitry when only one instruction is issued from a
buffer, and each buffer has to be able to store only one decoded instruction type.

53. Claims 11, 13, 16-20 are rejected under 35 U.S.C. 103(a) as being unpatentable
over Soni (US006223254B1).

54. In regard to claim 11:

55. Soni discloses a processor for processing a first instruction form (fig. 3, 
instructions stored in instruction cache 40; col. 8, lines 28-29) and a second instruction 
form (fig. 3, decoded instructions stored in parcel cache 52; col. 9, lines 20-23) of an 
instruction set comprising: 

a plurality of execution units for receiving instructions (fig. 3, 60,61,71,80,90); 

a branch unit (col. 8, lines 23-26, 35-38; fig. 3, BTB 42) connected to an
instruction fetch unit (col. 8, lines 28-30; fig. 3, instruction streaming buffer 53) for the 
first instruction form and a sequencer (col. 9, lines 29-31; fig. 3, instruction streaming 
buffer 53) for the second instruction form; 

a decode unit for decoding instructions of the first instruction form into control 
signals for the execution units (col. 5, lines 55-56; fig. 3, 54,55, 45-49). 

56. Soni differs from the current invention because it does not disclose the limitation
of having a plurality of buffers but discloses only a single buffer (parcel cache 52),
proximate to the execution units (fig. 3 shows the parcel cache proximate to the
execution units), for storing predecoded instructions of the second instruction form (col.
5, lines 55-56).

57. "Official Notice" is taken that it is well known and expected in the art that a buffer
split into a plurality of smaller buffers has the benefit of less complex indexing circuitry
leading to faster lookups.

58. Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to have modified the parcel cache by splitting it into a plurality of buffers.

59. One would have been motivated to do so to benefit from faster lookup times and 
hence faster processing. 

60. In regard to claim 13: 

61. Soni discloses the processor of claim 11, wherein the sequencer (fig. 3,
instruction streaming buffer 53), engaged by the branch unit (fig. 3, BTB 42), addresses 
the decoded instructions of the second instruction form stored in the buffers and 
sequences predecoded instructions of the second instruction form to the execution unit 
(the instruction streaming buffer fetches instructions from the parcel cache on a "hit" and 
sends them to the reservation stations to be executed [col. 9, lines 29-33, 12-13]). 

62. In regard to claim 16:

63. Soni discloses the processor of claim 11, wherein the branch unit switches the
processor from the first instruction form to the second instruction form in response to a
branch instruction of the first instruction form (fig. 4 shows that the decoded instructions
of the REP LODS AL instruction are stored in the parcel cache. These decoded
instructions in the parcel cache 52 are executed when the branch instruction
(JNZ.CC.XXXX) of the first instruction form in the instruction cache 40 targets the REP
LODS AL instruction [col. 12, lines 55-58]).

64. In regard to claim 17: 

65. Soni discloses the processor of claim 11, wherein the branch unit switches the
processor from the second instruction form to the first instruction form in response to a 
branch instruction of the second instruction form (fig. 4 shows the return instruction 
(JUMP) of the second instruction form in the parcel cache and col. 12, lines 45-46, the 
JUMP instruction indicates to fetch the next instruction i.e. the instruction of the first 
form after the REP LODS AL in the instruction cache). 

66. In regard to claim 18: 

67. Soni differs from the limitations of claim 18, namely it does not possess a switch
bit to signal the sequencer to stop fetching from the buffers and enable fetching of the
first instruction form from the memory, but discloses that a return to fetching of the first
instruction form from the memory is signaled by a return instruction of the second
instruction form executed in the branch unit (Although not explicitly mentioned, it is
deemed inherent to the processor that a return instruction (a type of branch instruction),
executed in the branch unit, would address an instruction to be fetched, i.e. signal
fetching of the first instruction form, because a return instruction commonly occurs after
the execution of a loop (frequently executed instructions of the second form) which
would instruct the processor to fetch from a section of code that is not frequently
executed, i.e. first instruction form instructions. This would result in a "hit" in the memory
and not in the parcel cache [col. 9, lines 31-33]).

68. "Official Notice" is taken that it is well known and expected in the art to have a 
status bit indicating the state of a signal (e.g. a Zero bit of a status register) to simplify 
processing of the signal. 

69. It would have been obvious to one of ordinary skill in the art at the time of the 
invention to add a switch bit indicating the control signal of the return instruction. 

70. One would have been motivated to do so in order to simplify processing and as it 
is common practice in the art. 

71. In regard to claim 19: 

72. Soni discloses the processor of claim 11, wherein the execution bandwidth of the
execution units (fig. 3 shows 5 execution units 60-61, 71, 80 and 90) is larger than the
fetch/issue bandwidth of the first form (fig. 3 shows that one instruction is issued to the
reservation station 50 but five instructions can be executed in parallel by the execution
units).

73. In regard to claim 20:

74. Soni discloses the processor of claim 11, wherein the second instruction form is
a logical subset of the first instruction form (col. 9, lines 20-23 and col. 10, lines 1-2 
indicate that the second instruction form is a decoded version of the first instruction form 
hence making it a logical subset of the first instruction form). 

75. Claim 12 is rejected under 35 U.S.C. 103(a) as being unpatentable over Soni 
(US006223254B1) as applied to claims 11, 13, 16-20 above, and further in view of Ball
and Larus ("Efficient Path Profiling," 29th Annual IEEE/ACM International Symposium on
Microarchitecture, Paris, pp. 46-57, 1996). 

76. In regard to claim 12: 

77. Soni discloses that the instructions of the second form are more frequently executed
than the instructions of the first form (col. 2, lines 43-47; col. 7, lines 22-25; LRU (Least
Recently Used) algorithm results in the parcel cache holding instructions that are
executed more frequently).

78. Soni differs from the instant invention because the instructions of the first form 
and instructions of the second form are generated by the processor based on 
execution frequency (col. 2, lines 43-47; col. 7, lines 22-25; LRU (Least Recently Used)
algorithm results in the parcel cache holding instructions that are executed more 
frequently) and not by the compiler. 

79. Ball et al. teach that path profiling can be used to identify heavily executed paths
in a program (col. 2, lines 22-23). They also teach that their efficient path profiling 
technique can be used for program optimization and performance tuning (col. 3, lines 
20-21). 

80. Therefore it would have been obvious to one of ordinary skill in the art to remove 
the processor hardware that executes the LRU algorithm to store decoded instructions 
in the parcel cache and use a compiler performing efficient path profiling instead to fill 
the parcel cache. 

81. One would have been motivated to do so because by using the compiler to perform the
function of the hardware, hardware is reduced and this translates to savings in die area 
and cost. 

82. Claims 14 and 15 are rejected under 35 U.S.C. 103(a) as being unpatentable 
over Soni (US006223254B1) as applied to claims 11, 13, 16-20 above, and further in 
view of Johnson ("Superscalar Microprocessor Design," Prentice Hall, 1991). 

83. In regard to claim 14: 

84. Soni differs from the instant invention in that he does not disclose the limitations 
for claim 14, namely a plurality of execution queues storing the decoded instructions of 
the first instruction form and the sequencer connected to and controlling a plurality of 
gates between the execution queues and execution units. 

85. Johnson teaches that distributed reservation stations corresponding to separate
functional units, as compared to a centralized reservation station/window design, have
the benefit of less complex circuitry because each station selects among a smaller number
of instructions to issue, needs no arbitration circuitry as only one instruction is
issued from a distributed reservation station, and does not need to be able to hold all
instruction types, only the type specific to its functional unit (pg. 134, lines 1-22).

86. "Official Notice" is taken that it is well known and expected that a smaller 
reservation station reduces complexity and minimizes lookup time. 

87. Therefore it would have been obvious to one of ordinary skill in the art at the time 
of the invention to modify the centralized execution queue (reservation station, fig. 3) by
separating it into a plurality of queues. 

88. It would also have been obvious to one of ordinary skill in the art at the time of
the invention to modify the processor by providing the instructions of the second form
(decoded instructions) in the parcel cache directly to the execution units (fig. 3,
60,61,71,80,90) as taught by Soni (col. 10, lines 29-31; "other pipeline stage") instead of
the reservation station because this would lead to a smaller reservation station as it
would not require as many entries. Inherently this would require the sequencer (fig. 3,
instruction streaming buffer 53), which is responsible for sending the instructions of the
second form to be executed, to disconnect the execution queues (reservation stations)
from the execution units to prevent the instructions of the first type from executing. A
plurality of gates connected between the execution queues and the execution units
would be required for this purpose.

89. One would have been motivated to make these modifications because it would
lead to less complex circuitry and hence faster processing as taught by Johnson. 

90. In regard to claim 15: 

91. Soni (US006223254B1) as applied to claim 1 teaches a plurality of buffers but
does not teach that each execution unit is associated with one buffer. 

92. However, it would have been obvious to one of ordinary skill in the art at the time 
of the invention to design the processor such that each execution unit is associated with 
one buffer by distributing the parcel cache into as many buffers as execution units and having
one for each.

93. One would have been motivated to do so because, from the teachings of
Johnson, circuit complexity is reduced if there is a different buffer for every execution
unit, as there is no need for arbitration circuitry when only one instruction is issued from a
buffer, and each buffer has to be able to store only one decoded instruction type.

94. Claim 21 is rejected under 35 U.S.C. 103(a) as being unpatentable over Soni
(US006223254B1) in view of Ball and Larus ("Efficient Path Profiling," 29th Annual
IEEE/ACM International Symposium on Microarchitecture, Paris, pp. 46-57, 1996) and 
Johnson ("Superscalar Microprocessor Design," Prentice Hall, 1991). 

95. In regard to claim 21:

96. Soni discloses a processor for processing a first instruction form (microprocessor 
instruction col. 8, lines 28-29) and a second instruction form (decoded instructions col. 
9, lines 20-23) of an instruction set comprising: 

a plurality of execution units for receiving instructions (fig. 3, 60,61,71,80,90);

97. a branch unit (col. 8, lines 23-26, 35-38; fig. 3, BTB 42) connected to an 
instruction fetch unit (col. 8, lines 28-30; fig. 3, instruction streaming buffer 53) for the 
first instruction form, wherein the branch unit switches the processor from the first 
instruction form to the second instruction form in response to a branch instruction of the 
first instruction form (fig. 4 shows that the decoded instructions of the REP LODS AL 
instruction are stored in the parcel cache. These decoded instructions in the parcel 
cache 52 are executed when the branch instruction (JNZ.CC.XXXX) of the first 
instruction form in the instruction cache 40 targets the REP LODS AL instruction [col. 12, 
lines 55-58]) and switches the processor from the second instruction form to the first 
instruction form in response to a branch instruction of the second instruction form (fig. 4 
shows the return instruction (JUMP) of the second instruction form in the parcel cache 
and col. 12, lines 45-46, the JUMP instruction indicates to fetch the next instruction i.e.
the instruction of the first form after the REP LODS AL in the instruction cache). 

a decode unit for decoding instructions of the first instruction form into 
instructions for the execution units (col. 5, lines 55-56; fig. 3, 54,55, 45-49). 

an issue unit adapted to sequence decoded instructions of the first instruction 
form (col. 5, lines 40-45, 56-60); 

the instructions of the second form are more frequently executed than the
instructions of the first form (col. 2, lines 43-47; col. 7, lines 22-25; LRU (Least Recently Used)
algorithm results in the parcel cache holding instructions that are executed more
frequently).

the sequencer (fig. 3, instruction streaming buffer 53), engaged by the branch 
unit (fig. 3, BTB 42), adapted to fetch the predecoded instructions and sequence the 
predecoded instructions of the second instruction form (the instruction streaming buffer 
fetches instructions from the parcel cache on a "hit" and sends them to the reservation 
stations to be executed [col. 9, lines 29-33, 12-13]). 
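
For illustration only, the LRU behavior relied upon above (a frequently executed instruction tends to remain resident in the cache of decoded instructions) can be sketched as follows. This sketch is not taken from Soni; the class name ParcelCacheSketch, the capacity of two entries, and the instruction labels are hypothetical assumptions chosen for the example.

```python
from collections import OrderedDict


class ParcelCacheSketch:
    """Hypothetical LRU cache of decoded instructions (not Soni's design)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # oldest (least recently used) first

    def access(self, address, decode):
        """Return (decoded, hit). On a miss, decode and insert, evicting LRU."""
        if address in self.entries:
            # Hit: mark the entry as most recently used.
            self.entries.move_to_end(address)
            return self.entries[address], True
        decoded = decode(address)
        self.entries[address] = decoded
        if len(self.entries) > self.capacity:
            # Evict the least recently used entry.
            self.entries.popitem(last=False)
        return decoded, False


def run_trace(trace, capacity=2):
    """Replay a sequence of instruction addresses; return hits and residents."""
    cache = ParcelCacheSketch(capacity)
    hits = 0
    for address in trace:
        _, hit = cache.access(address, lambda a: f"decoded({a})")
        hits += hit
    return hits, list(cache.entries)
```

In a trace where one instruction recurs between other, one-off instructions, the recurring instruction stays resident while the one-off instructions are evicted in turn.
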
98. Soni differs from the current invention because 

he does not disclose a plurality of buffers but discloses only a single buffer 
(parcel cache 52), proximate to the execution units, for storing predecoded instructions 
of the second instruction form (col. 5, lines 55-56), 

the instructions of the first form and instructions of the second form are 
generated by the processor based on execution frequency (col. 2, lines 43-47; col. 7, lines 
22-25; the LRU (Least Recently Used) algorithm results in the parcel cache holding 
instructions that are executed more frequently) and not by the compiler, and 

he does not disclose a plurality of execution queues storing the decoded 
instructions of the first instruction form and the sequencer connected to and controlling 
a plurality of gates between the execution queues and execution units. 

99. "Official Notice" is taken that it is well known and expected in the art that a buffer 
split into a plurality of smaller buffers has the benefit of less complex indexing circuitry, 
leading to faster lookups. 

100. Ball et al. teach that path profiling can be used to identify heavily executed paths 
in a program (col. 2, lines 22-23). They also teach that their efficient path profiling 
technique can be used for program optimization and performance tuning (col. 3, lines 
20-21). 
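
For illustration only, the idea of identifying heavily executed paths can be sketched with a naive path counter. This is not the efficient path profiling technique of Ball et al.; it simply tallies complete paths that have already been recorded. The function name hot_paths and the basic-block labels are hypothetical.

```python
from collections import Counter


def hot_paths(executed_paths, top=1):
    """Count how often each recorded path (a sequence of basic-block labels)
    was executed and return the `top` most frequent paths with their counts."""
    counts = Counter(tuple(path) for path in executed_paths)
    return counts.most_common(top)
```

Given three recorded runs through blocks A-B-D, A-C-D, and A-B-D, the path A-B-D is reported as the most heavily executed, and a compiler could treat it as the candidate for optimization.
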

101. Johnson teaches that distributed reservation stations corresponding to separate 
functional units, as compared to a centralized reservation station/window design, have 
the benefit of less complex circuitry because selection is made among fewer 
instructions to issue, there is no need for arbitration circuitry as only one instruction is 
issued from a distributed reservation station, and each station does not need to be able to 
hold all instruction types, only the type specific to its functional unit (pg. 134, lines 1-22). 
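
For illustration only, the difference in the number of instructions that issue logic must select among can be sketched as follows. This sketch is not taken from Johnson; the function names, the unit names, and the example window contents are hypothetical assumptions.

```python
from collections import defaultdict


def centralized_candidates(window):
    """In a centralized window, issue logic selects among every waiting
    instruction, whatever its type."""
    return len(window)


def distributed_candidates(window):
    """With one reservation station per functional unit, each station holds
    and selects among only the instructions of its own type."""
    stations = defaultdict(list)
    for unit, instruction in window:
        stations[unit].append(instruction)
    return {unit: len(queue) for unit, queue in stations.items()}
```

For a window of five waiting instructions spread over three functional units, the centralized design selects among all five, whereas no distributed station in this example holds more than two.
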

102. "Official Notice" is taken that it is well known and expected that a smaller 
reservation station reduces complexity and minimizes lookup time. 

103. Therefore, it would have been obvious to one of ordinary skill in the art at the time 
of the invention to have modified the parcel cache by splitting it into a plurality of buffers. 

104. One would have been motivated to do so to benefit from faster lookup times and 
hence faster processing. 

105. It would therefore also have been obvious to one of ordinary skill in the art to 
remove the processor hardware that executes the LRU algorithm to store decoded 
instructions in the parcel cache and to use a compiler performing efficient path profiling 
instead to fill the parcel cache. 

106. One would have been motivated to do so because, by using the compiler to perform the 
function of the hardware, hardware is reduced, and this translates to savings in die area 
and cost. 

107. Furthermore it would have also been obvious to one of ordinary skill in the art at 
the time of the invention to modify the centralized execution queue (reservation station, 
fig. 3) by separating it into a plurality of queues. 

108. It would also have been obvious to one of ordinary skill in the art at the time of 
the invention to modify the processor by providing the instructions of the second form 
(decoded instructions) in the parcel cache directly to the execution units (fig. 3, 
60, 61, 71, 80, 90) as taught by Soni (col. 10, lines 29-31; "other pipeline stage") instead of 
the reservation station because this would lead to a smaller reservation station as it 
would not require as many entries. Inherently this would require the sequencer (fig. 3, 
instruction streaming buffer 53), which is responsible for sending the instructions of the 
second form to be executed, to disconnect the execution queues (reservation stations) 
from the execution units to prevent the instructions of the first type from executing. A 
plurality of gates connected between the execution queues and the execution units 
would be required for this purpose. 

109. One would have been motivated to make these modifications because it would 
lead to less complex circuitry and hence faster processing as taught by Johnson. 



Conclusion 



110. The prior art made of record and not relied upon is considered pertinent to 
applicant's disclosure. Applicant is reminded that in amending in reply to a rejection of 
claims in an application or patent under reexamination, the applicant or patent owner 
must clearly point out the patentable novelty, which he or she thinks the claims present 
in view of the state of the art disclosed by the references cited or the objections made. 
The applicant or patent owner must also show how the amendments avoid such 
references or objections. See 37 CFR §1.111. 

a. Asghar et al. (US006085314A) teach a gating mechanism which selects 
instructions from either an instruction cache or specialized instructions from a 
lookup table in fig. 9. 

b. Grochowski et al. (US006625756B1) teach a replay unit which holds 
decoded instructions in close proximity to the execution units and from which 
instructions are fed to the execution units under certain conditions. 

c. Akkary et al. (US006247121B1) teach a plurality of trace buffers which 
hold decoded instructions which are gated into the execution pipeline via a MUX 
gate. 

d. Chen (US006643736B1) teaches special scratch-pad memories for 
holding decoded instructions which are executed frequently to save space in the 
cache memory. 

e. Ibusaki et al. (US005615375A) teach a special decode cache which is 
useful when a loop is executed, i.e., for frequently occurring instructions. 

f. Keller et al. (US006502185B1) teach a predecode cache (fig. 3). 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Amol V. Gole whose telephone number is 703-305-8888. 
The examiner can normally be reached on 9:00-6:30. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Eddie Chan, can be reached on 703-305-9712. The fax phone number for 
the organization where this application or proceeding is assigned is 703-872-9306. 

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). 